therefore treated as an essential gene cluster. Several density
estimation algorithms have been developed for identifying essential genes
from a genome-wide transposon statistic [Langridge, et al., 2009;
Zomer, et al., 2012; Yang, et al., 2017]. The models differ in what they
assume about the nature of the transposon insertion profile. TraDIS
discovers essential genes by estimating a density function of the
transposon insertion sites per gene statistic [Langridge, et al., 2009].
ESSENTIALS does so by estimating a density function of the transposon
insertions per gene statistic [Zomer, et al., 2012]. DEM (distal effect
model) estimates a density function of the mutation feature statistic,
which is the convolution of the transposon insertions per gene and the
transposon insertion sites per gene [Yang, et al., 2017].
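The general idea in the paragraph above, estimating a density over a per-gene transposon statistic and reading essential genes off its low mode, can be illustrated with a small sketch. This is not the procedure used by TraDIS, ESSENTIALS, or DEM: it swaps in a plain Gaussian kernel density estimate and a valley-finding threshold, and the gene names, insertion-site densities, and bandwidth are all hypothetical values chosen for illustration.

```python
from math import exp, pi, sqrt

# Hypothetical insertion-site densities (sites per base pair) for twelve genes.
insertion_index = {
    "gene01": 0.000, "gene02": 0.002, "gene03": 0.004, "gene04": 0.005,
    "gene05": 0.003, "gene06": 0.050, "gene07": 0.060, "gene08": 0.055,
    "gene09": 0.070, "gene10": 0.065, "gene11": 0.045, "gene12": 0.058,
}


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = len(samples) * bandwidth * sqrt(2 * pi)

    def density(x):
        return sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density


def valley_threshold(samples, bandwidth, grid_size=400):
    """Locate the density minimum between the two tallest modes."""
    density = gaussian_kde(samples, bandwidth)
    lo = min(samples) - 3 * bandwidth
    hi = max(samples) + 3 * bandwidth
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    values = [density(x) for x in grid]
    # Interior grid points that are higher than both neighbours are modes.
    peaks = [i for i in range(1, grid_size - 1)
             if values[i - 1] < values[i] > values[i + 1]]
    if len(peaks) < 2:
        raise ValueError("density is not bimodal at this bandwidth")
    left, right = sorted(sorted(peaks, key=lambda i: values[i])[-2:])
    valley = min(range(left, right + 1), key=lambda i: values[i])
    return grid[valley]


# The bandwidth is an assumed value, not a recommended setting.
threshold = valley_threshold(list(insertion_index.values()), bandwidth=0.008)
essential = sorted(g for g, v in insertion_index.items() if v < threshold)
print(f"threshold = {threshold:.4f}")
print("candidate essential genes:", essential)
```

On this toy data the density has one mode near zero and one near 0.06, so the genes falling below the valley between them are the candidates. With real data the result is sensitive to the bandwidth, which is why the cited tools fit explicit distribution models rather than relying on a fixed kernel width.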
In addition to density estimation, cluster analysis can also be
employed to separate essential genes from non-essential genes,
especially in the context of multivariate analysis.
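As a minimal sketch of the clustering alternative, the following groups genes described by two statistics into two clusters with a plain two-means procedure. The choice of k-means, the deterministic initialisation, the feature pair, and every gene name and value are assumptions made for illustration; the source does not name a specific clustering algorithm.

```python
from math import dist

# Hypothetical per-gene features: (insertion sites per kb, insertions per kb).
genes = {
    "geneA": (0.5, 1.0), "geneB": (1.0, 2.0), "geneC": (0.0, 0.0),
    "geneD": (20.0, 60.0), "geneE": (25.0, 80.0),
    "geneF": (18.0, 55.0), "geneG": (30.0, 90.0),
}


def two_means(points, iterations=20):
    """Cluster points into two groups; cluster 0 starts nearest the origin."""
    origin = tuple(0.0 for _ in points[0])
    centres = [min(points, key=lambda p: dist(p, origin)),
               max(points, key=lambda p: dist(p, origin))]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if dist(p, centres[0]) <= dist(p, centres[1]) else 1
            groups[nearest].append(p)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centre
                   for g, centre in zip(groups, centres)]
    labels = [0 if dist(p, centres[0]) <= dist(p, centres[1]) else 1
              for p in points]
    return labels, centres


names = list(genes)
labels, centres = two_means([genes[n] for n in names])
# Cluster 0 is the low-insertion cluster, read here as candidate essential genes.
essential_cluster = sorted(n for n, lab in zip(names, labels) if lab == 0)
print(essential_cluster)
```

In practice the features would usually be scaled before clustering, since an unscaled statistic with a large range dominates the Euclidean distance.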
Density estimation
Often, future inference must rest on knowledge obtained from collected
experimental data through a pattern discovery and [...] process.
Density estimation is one such process: it reconstructs the unknown
distribution from which the data are assumed to have been sampled or
drawn [Silverman, 1986; Duda, et al., 2000]. For instance, a mean and a
standard deviation can be estimated from a data set if the data set is
assumed to be a sample drawn from a Gaussian distribution. Once these two
parameters have been well estimated, a distribution model of the data set
can be constructed and used for future inference on novel data.
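The parametric example above can be sketched in a few lines using Python's standard `statistics.NormalDist`. The observations and the novel value are hypothetical, and the Gaussian assumption is taken as given rather than tested.

```python
from statistics import NormalDist

# Hypothetical measurements assumed to be drawn from a Gaussian distribution.
observations = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]

# Estimate the two parameters (sample mean and sample standard deviation).
model = NormalDist.from_samples(observations)

# Use the fitted model for inference on a novel observation.
novel = 6.0
density_at_novel = model.pdf(novel)
upper_tail = 1.0 - model.cdf(novel)

print(f"mean = {model.mean:.3f}, stdev = {model.stdev:.3f}")
print(f"density at {novel} = {density_at_novel:.3g}")
print(f"P(X > {novel}) = {upper_tail:.3g}")
```

For these values the estimated mean is 5.0 and the estimated standard deviation is 0.2, so the novel value of 6.0 lies five standard deviations above the mean and receives a very small upper-tail probability under the fitted model.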
It should be noted that a density estimated from a collected data
set may deviate from the expectation. This is not surprising,
because a drawn sample always has a much smaller size than a whole